Overview

Dataset statistics

Number of variables: 11
Number of observations: 139
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.6 KiB
Average record size in memory: 48.9 B

Variable types

NUM: 10
BOOL: 1

Reproduction

Analysis started: 2020-08-11 23:01:58.744254
Analysis finished: 2020-08-11 23:02:10.503805
Duration: 11.76 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

growth_rate has constant value "0" (Constant)
tests_positive is highly correlated with df_index and 4 other fields (High correlation)
df_index is highly correlated with tests_positive and 4 other fields (High correlation)
tests_negative is highly correlated with df_index and 4 other fields (High correlation)
tests is highly correlated with df_index and 4 other fields (High correlation)
patients_hosp is highly correlated with patients_icu (High correlation)
patients_icu is highly correlated with patients_hosp (High correlation)
recovered is highly correlated with df_index and 4 other fields (High correlation)
rolling_ave is highly correlated with df_index and 4 other fields (High correlation)
df_index has unique values (Unique)
tests_positive has unique values (Unique)
tests_negative has unique values (Unique)
tests has unique values (Unique)
rolling_ave has unique values (Unique)
tests_pending has 117 (84.2%) zeros (Zeros)
patients_icu has 18 (12.9%) zeros (Zeros)
patients_hosp has 13 (9.4%) zeros (Zeros)
patients_vent has 29 (20.9%) zeros (Zeros)
recovered has 13 (9.4%) zeros (Zeros)
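
The percentage in each zeros warning is the zero count divided by the 139 observations. The following is a minimal sketch that recomputes those shares from the counts listed above; the names `zero_counts` and `zero_share` are illustrative and not part of pandas-profiling.

```python
# Zero counts per column, copied from the warnings above.
N_OBSERVATIONS = 139
zero_counts = {
    "tests_pending": 117,
    "patients_icu": 18,
    "patients_hosp": 13,
    "patients_vent": 29,
    "recovered": 13,
}


def zero_share(count, n=N_OBSERVATIONS):
    """Return the share of zeros as a percentage, rounded to one decimal."""
    return round(100 * count / n, 1)


# Recompute the percentage shown next to each count.
zero_shares = {name: zero_share(count) for name, count in zero_counts.items()}
for name, share in zero_shares.items():
    print(f"{name}: {share}%")
```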

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct count: 139
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 75.0
Minimum: 6
Maximum: 144
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 6
5-th percentile: 12.9
Q1: 40.5
Median: 75
Q3: 109.5
95-th percentile: 137.1
Maximum: 144
Range: 138
Interquartile range (IQR): 69

Descriptive statistics

Standard deviation: 40.26992261
Coefficient of variation (CV): 0.5369323014
Kurtosis: -1.2
Mean: 75
Median Absolute Deviation (MAD): 35
Skewness: 0
Sum: 10425
Variance: 1621.666667
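
The descriptive statistics for df_index can be checked by hand. The sketch below assumes that df_index holds every integer from 6 to 144, which is consistent with 139 distinct values, a minimum of 6 and a maximum of 144, but is not stated by the report. It uses only the Python standard library, with the sample variance (n - 1 denominator) and linearly interpolated quartiles.

```python
import statistics

# Assumption: df_index is the run of consecutive integers 6..144 (139 values).
values = list(range(6, 145))

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, n - 1 denominator
std_dev = statistics.stdev(values)       # square root of the sample variance
cv = std_dev / mean                      # coefficient of variation
total = sum(values)

# Median absolute deviation: median of the absolute distances to the median.
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

# Quartiles with linear interpolation between neighbouring data points.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

print(mean, variance, std_dev, cv, total, mad, q1, q3, iqr)
```

Under that assumption the results agree with the figures reported above to the precision shown.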
Histogram with fixed size bins (bins=10)

Value               Count  Frequency (%)
144                 1      0.7%
49                  1      0.7%
55                  1      0.7%
54                  1      0.7%
53                  1      0.7%
52                  1      0.7%
51                  1      0.7%
50                  1      0.7%
48                  1      0.7%
40                  1      0.7%
Other values (129)  129    92.8%

Value               Count  Frequency (%)
6                   1      0.7%
7                   1      0.7%
8                   1      0.7%
9                   1      0.7%
10                  1      0.7%

Value               Count  Frequency (%)
144                 1      0.7%
143                 1      0.7%
142                 1      0.7%
141                 1      0.7%
140                 1      0.7%

tests_positive
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct count: 139
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 64697.467625899284
Minimum: 71
Maximum: 178009
Zeros: 0
Zeros (%): 0.0%
Memory size: 556.0 B

Quantile statistics

Minimum: 71
5-th percentile: 402.5
Q1: 12634
Median: 62266
Q3: 101105
95-th percentile: 158564.2
Maximum: 178009
Range: 177938
Interquartile range (IQR): 88471

Descriptive statistics

Standard deviation: 52409.38191
Coefficient of variation (CV): 0.810068521
Kurtosis: -0.947107105
Mean: 64697.46763
Median Absolute Deviation (MAD): 46061
Skewness: 0.4020604077
Sum: 8992948
Variance: 2746743313
Histogram with fixed size bins (bins=10)

Value               Count  Frequency (%)
87294               1      0.7%
1869                1      0.7%
7696                1      0.7%
52050               1      0.7%
56401               1      0.7%
91984               1      0.7%
152911              1      0.7%
118606              1      0.7%
8268                1      0.7%
6974                1      0.7%
Other values (129)  129    92.8%

Value               Count  Frequency (%)
71                  1      0.7%
96                  1      0.7%
127                 1      0.7%
169                 1      0.7%
229                 1      0.7%

Value               Count  Frequency (%)
178009              1      0.7%
177964              1      0.7%
174660              1      0.7%
171821              1      0.7%
169034              1      0.7%

tests_negative
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct count: 139
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 795824.2733812949
Minimum: 518
Maximum: 2448856
Zeros: 0
Zeros (%): 0.0%
Memory size: 556.0 B

Quantile statistics

Minimum: 518
5-th percentile: 4746.6
Q1: 138872
Median: 562066
Q3: 1344280.5
95-th percentile: 2223234.4
Maximum: 2448856
Range: 2448338
Interquartile range (IQR): 1205408.5

Descriptive statistics

Standard deviation: 745635.8535
Coefficient of variation (CV): 0.9369352989
Kurtosis: -0.7850624495
Mean: 795824.2734
Median Absolute Deviation (MAD): 492220
Skewness: 0.7122378858
Sum: 110619574
Variance: 5.55972826e+11
Histogram with fixed size bins (bins=10)

Value               Count  Frequency (%)
1315413             1      0.7%
1837899             1      0.7%
648280              1      0.7%
165973              1      0.7%
2391636             1      0.7%
209491              1      0.7%
2118323             1      0.7%
670364              1      0.7%
2197729             1      0.7%
8512                1      0.7%
Other values (129)  129    92.8%

Value               Count  Frequency (%)
518                 1      0.7%
900                 1      0.7%
1349                1      0.7%
2149                1      0.7%
2869                1      0.7%

Value               Count  Frequency (%)
2448856             1      0.7%
2423595             1      0.7%
2391636             1      0.7%
2362999             1      0.7%
2332495             1      0.7%

tests_pending
Real number (ℝ≥0)

ZEROS

Distinct count: 17
Unique (%): 12.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19.215827338129497
Minimum: 0
Maximum: 385
Zeros: 117
Zeros (%): 84.2%
Memory size: 556.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 265.3
Maximum: 385
Range: 385
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 70.06692066
Coefficient of variation (CV): 3.646312981
Kurtosis: 13.7433766
Mean: 19.21582734
Median Absolute Deviation (MAD): 0
Skewness: 3.835709983
Sum: 2671
Variance: 4909.373371
Histogram with fixed size bins (bins=10)

Value               Count  Frequency (%)
0                   117    84.2%
1                   3      2.2%
2                   3      2.2%
268                 3      2.2%
270                 1      0.7%
3                   1      0.7%
265                 1      0.7%
10                  1      0.7%
385                 1      0.7%
125                 1      0.7%
Other values (7)    7      5.0%

Value               Count  Frequency (%)
0                   117    84.2%
1                   3      2.2%
2                   3      2.2%
3                   1      0.7%
10                  1      0.7%

Value               Count  Frequency (%)
385                 1      0.7%
350                 1      0.7%
277                 1      0.7%
270                 1      0.7%
268                 3      2.2%

tests
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct count: 139
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 860521.7410071943
Minimum: 589
Maximum: 2626820
Zeros: 0
Zeros (%): 0.0%
Memory size: 556.0 B

Quantile statistics

Minimum: 589
5-th percentile: 5149.1
Q1: 151506
Median: 624332
Q3: 1445385.5
95-th percentile: 2381798.6
Maximum: 2626820
Range: 2626231
Interquartile range (IQR): 1293879.5

Descriptive statistics

Standard deviation: 797370.0081
Coefficient of variation (CV): 0.926612275
Kurtosis: -0.7943924711
Mean: 860521.741
Median Absolute Deviation (MAD): 542333
Skewness: 0.6946421337
Sum: 119612522
Variance: 6.357989299e+11
Histogram with fixed size bins (bins=10)

Value               Count  Frequency (%)
543999              1      0.7%
498759              1      0.7%
48208               1      0.7%
81999               1      0.7%
223822              1      0.7%
589                 1      0.7%
1428674             1      0.7%
122440              1      0.7%
2155237             1      0.7%
525881              1      0.7%
Other values (129)  129    92.8%

Value               Count  Frequency (%)
589                 1      0.7%
996                 1      0.7%
1476                1      0.7%
2318                1      0.7%
3098                1      0.7%

Value               Count  Frequency (%)
2626820             1      0.7%
2601604             1      0.7%
2566296             1      0.7%
2534820             1      0.7%
2501529             1      0.7%

patients_icu
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct count: 101
Unique (%): 72.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 207.6474820143885
Minimum: 0
Maximum: 381
Zeros: 18
Zeros (%): 12.9%
Memory size: 556.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 160.5
Median: 217
Q3: 302
95-th percentile: 366.3
Maximum: 381
Range: 381
Interquartile range (IQR): 141.5

Descriptive statistics

Standard deviation: 118.1399234
Coefficient of variation (CV): 0.5689446471
Kurtosis: -0.8476325487
Mean: 207.647482
Median Absolute Deviation (MAD): 75
Skewness: -0.4692885604
Sum: 28863
Variance: 13957.0415
Histogram with fixed size bins (bins=10)

Value               Count  Frequency (%)
0                   18     12.9%
206                 2      1.4%
354                 2      1.4%
356                 2      1.4%
362                 2      1.4%
373                 2      1.4%
64                  2      1.4%
165                 2      1.4%
166                 2      1.4%
302                 2      1.4%
Other values (91)   103    74.1%

Value               Count  Frequency (%)
0                   18     12.9%
24                  1      0.7%
26                  1      0.7%
27                  1      0.7%
38                  1      0.7%

Value               Count  Frequency (%)
381                 1      0.7%
378                 1      0.7%
376                 1      0.7%
373                 2      1.4%
370                 1      0.7%

patients_hosp
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct count: 117
Unique (%): 84.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1205.9208633093526
Minimum: 0
Maximum: 1964
Zeros: 13
Zeros (%): 9.4%
Memory size: 556.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1063
Median: 1365
Q3: 1643.5
95-th percentile: 1840.6
Maximum: 1964
Range: 1964
Interquartile range (IQR): 580.5

Descriptive statistics

Standard deviation: 590.9157112
Coefficient of variation (CV): 0.4900120142
Kurtosis: -0.1094508173
Mean: 1205.920863
Median Absolute Deviation (MAD): 282
Skewness: -1.043136761
Sum: 167623
Variance: 349181.3777
Histogram with fixed size bins (bins=10)

Value               Count  Frequency (%)
0                   13     9.4%
1777                2      1.4%
1341                2      1.4%
849                 2      1.4%
1455                2      1.4%
1749                2      1.4%
1769                2      1.4%
117                 2      1.4%
1763                2      1.4%
1210                2      1.4%
Other values (107)  108    77.7%

Value               Count  Frequency (%)
0                   13     9.4%
49                  1      0.7%
62                  1      0.7%
66                  1      0.7%
76                  1      0.7%

Value               Count  Frequency (%)
1964                1      0.7%
1961                1      0.7%
1914                1      0.7%
1874                1      0.7%
1857                1      0.7%

patients_vent
Real number (ℝ≥0)

ZEROS

Distinct count: 87
Unique (%): 62.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 107.26618705035972
Minimum: 0
Maximum: 252
Zeros: 29
Zeros (%): 20.9%
Memory size: 556.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 41
Median: 107
Q3: 161
95-th percentile: 234.3
Maximum: 252
Range: 252
Interquartile range (IQR): 120

Descriptive statistics

Standard deviation: 77.51489577
Coefficient of variation (CV): 0.722640544
Kurtosis: -1.05635318
Mean: 107.2661871
Median Absolute Deviation (MAD): 59
Skewness: 0.07751885322
Sum: 14910
Variance: 6008.559066
Histogram with fixed size bins (bins=10)

Value               Count  Frequency (%)
0                   29     20.9%
97                  3      2.2%
30                  3      2.2%
141                 3      2.2%
88                  3      2.2%
112                 2      1.4%
96                  2      1.4%
106                 2      1.4%
92                  2      1.4%
199                 2      1.4%
Other values (77)   88     63.3%

Value               Count  Frequency (%)
0                   29     20.9%
18                  2      1.4%
30                  3      2.2%
41                  2      1.4%
43                  1      0.7%

Value               Count  Frequency (%)
252                 1      0.7%
250                 1      0.7%
248                 1      0.7%
244                 1      0.7%
242                 1      0.7%

recovered
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct count: 127
Unique (%): 91.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31446.208633093524
Minimum: 0
Maximum: 90149
Zeros: 13
Zeros (%): 9.4%
Memory size: 556.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2340.5
Median: 25387
Q3: 55557.5
95-th percentile: 82429.2
Maximum: 90149
Range: 90149
Interquartile range (IQR): 53217

Descriptive statistics

Standard deviation: 29523.69785
Coefficient of variation (CV): 0.938863511
Kurtosis: -1.236664188
Mean: 31446.20863
Median Absolute Deviation (MAD): 24351
Skewness: 0.4545167393
Sum: 4371023
Variance: 871648734.5
Histogram with fixed size bins (bins=10)

Value               Count  Frequency (%)
0                   13     9.4%
36185               1      0.7%
33856               1      0.7%
4161                1      0.7%
46662               1      0.7%
67143               1      0.7%
72207               1      0.7%
8523                1      0.7%
76                  1      0.7%
6347                1      0.7%
Other values (117)  117    84.2%

Value               Count  Frequency (%)
0                   13     9.4%
31                  1      0.7%
76                  1      0.7%
107                 1      0.7%
121                 1      0.7%

Value               Count  Frequency (%)
90149               1      0.7%
88412               1      0.7%
87249               1      0.7%
86157               1      0.7%
84981               1      0.7%

growth_rate
Boolean

CONSTANT
REJECTED

Distinct count: 1
Unique (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 556.0 B

Value               Count  Frequency (%)
0                   139    100.0%

rolling_ave
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct count: 139
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 60620.10791366907
Minimum: 16
Maximum: 173907
Zeros: 0
Zeros (%): 0.0%
Memory size: 556.0 B

Quantile statistics

Minimum: 16
5-th percentile: 118.2
Q1: 10100.5
Median: 55400
Q3: 98561.5
95-th percentile: 153699.2
Maximum: 173907
Range: 173891
Interquartile range (IQR): 88461

Descriptive statistics

Standard deviation: 51475.88339
Coefficient of variation (CV): 0.8491552583
Kurtosis: -0.9808194556
Mean: 60620.10791
Median Absolute Deviation (MAD): 44972
Skewness: 0.4452803553
Sum: 8426195
Variance: 2649766571
Histogram with fixed size bins (bins=10)

Value               Count  Frequency (%)
92159               1      0.7%
86860               1      0.7%
132952              1      0.7%
123735              1      0.7%
148054              1      0.7%
153427              1      0.7%
78671               1      0.7%
14158               1      0.7%
121752              1      0.7%
8510                1      0.7%
Other values (129)  129    92.8%

Value               Count  Frequency (%)
16                  1      0.7%
23                  1      0.7%
32                  1      0.7%
42                  1      0.7%
56                  1      0.7%

Value               Count  Frequency (%)
173907              1      0.7%
170863              1      0.7%
167853              1      0.7%
164802              1      0.7%
161826              1      0.7%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for an exact linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
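
The calculation can be sketched in plain Python. This is an illustrative implementation, not the code the report itself runs; `pearson_r` is a hypothetical helper name, and the two short lists are the tests_positive and tests values from the first five sample rows of this report.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1 / (n - 1) factors in the covariance and in the two standard
    deviations cancel, so only the sums of products are needed.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# First five sample rows of the report (tests_positive and tests columns).
tests_positive = [71, 96, 127, 169, 229]
tests = [589, 996, 1476, 2318, 3098]
r = pearson_r(tests_positive, tests)
print(r)

# Shifting and rescaling one variable leaves r unchanged.
rescaled = [2 * v + 3 for v in tests_positive]
print(pearson_r(rescaled, tests))
```

Because the two columns grow almost proportionally in these rows, r comes out close to +1, and rescaling one of them does not change it.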

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
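
A minimal sketch of this procedure follows: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. The helper names are hypothetical, and giving tied values the average of their rank positions is an assumption about tie handling that the text above does not specify.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship: rho is 1, Pearson's r is below 1.
x = [1, 2, 3, 4, 5]
cubes = [v ** 3 for v in x]
print(spearman_rho(x, cubes), pearson_r(x, cubes))
```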

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
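
The pair-counting definition can be sketched directly. The function below implements the formula as stated (often called tau-a), in which tied pairs count as neither concordant nor discordant. `kendall_tau_a` is a hypothetical name, and statistical libraries often use a tie-corrected variant instead, so results on data with many ties may differ.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1      # both variables move in the same direction
        elif product < 0:
            discordant += 1      # the variables move in opposite directions
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# One swapped pair out of six: 5 concordant, 1 discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```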

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

Sample

First rows

row  df_index  tests_positive  tests_negative  tests_pending  tests    patients_icu  patients_hosp  patients_vent  recovered  growth_rate  rolling_ave
0    6         71              518             49             589      0             0              0              0          0            16
1    7         96              900             52             996      0             0              0              0          0            23
2    8         127             1349            17             1476     0             0              0              0          0            32
3    9         169             2149            10             2318     0             0              0              0          0            42
4    10        229             2869            0              3098     0             0              0              0          0            56
5    11        306             3719            35             4025     0             0              0              0          0            72
6    12        344             4257            350            4601     0             0              0              0          0            93
7    13        409             4801            385            5210     0             0              0              0          0            121
8    14        473             6632            270            7105     0             0              0              0          0            160
9    15        565             7619            268            8184     0             0              0              0          0            208

Last rows

row  df_index  tests_positive  tests_negative  tests_pending  tests    patients_icu  patients_hosp  patients_vent  recovered  growth_rate  rolling_ave
129  135       152911          2156141         0              2309052  253           1455           131            79961      0            148054
130  136       156300          2197729         0              2354029  302           1455           137            80997      0            150725
131  137       158253          2220028         0              2378281  297           1472           139            82315      0            153427
132  138       161365          2252092         0              2413457  302           1480           145            83457      0            156149
133  139       164795          2292128         0              2456923  292           1496           146            84187      0            158950
134  140       169034          2332495         0              2501529  278           1647           210            84981      0            161826
135  141       171821          2362999         0              2534820  285           1964           211            86157      0            164802
136  142       174660          2391636         0              2566296  290           1961           212            87249      0            167853
137  143       178009          2423595         0              2601604  262           1857           215            88412      0            170863
138  144       177964          2448856         0              2626820  271           1640           167            90149      0            173907